rep_sample_n(coaches,
             size = 25,
             reps = 1,
             replace = TRUE)
Salaries of football coaches
Sampling Strategies
What types of samples could we collect? Are some methods “better” than other methods?
At your table…
First
Then
Each table has a sample of 25 UC & CSU coach salaries.
Would you feel comfortable inferring that the median salary of your sample is close to the median salary of all UC & CSU coaches?
Why or why not?
Why sample more than once?
Variability is a central focus of the discipline of Statistics!
Making decisions based on limited information is uncomfortable!
You likely weren’t willing to infer the population median salary from your sample!
Sampling Framework
population – collection of observations / individuals we are interested in
population parameter – numerical summary about the population that is unknown but you wish you knew
sample – a collection of observations from the population
sample statistic – a summary statistic computed from a sample that estimates the unknown population parameter.
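The four terms above can be made concrete with a small sketch in base R. The population here is simulated and purely hypothetical (252 made-up salaries, not the real coaches data), and the names `population`, `one_sample`, and `sample_median` are illustrative only. Because the whole population is simulated, its median can be computed directly, which is exactly what we cannot do in practice.

```r
# Hypothetical stand-in population: 252 simulated salaries (NOT the real coaches data).
set.seed(2019)
population <- round(rlnorm(252, meanlog = log(140000), sdlog = 0.6))

# Population parameter: normally unknown; computable here only because
# we simulated every individual in the population.
population_median <- median(population)

# Sample: 25 individuals drawn from the population (without replacement,
# an assumption made for this sketch).
one_sample <- sample(population, size = 25, replace = FALSE)

# Sample statistic: the estimate of the population parameter.
sample_median <- median(one_sample)

c(parameter = population_median, statistic = sample_median)
```

Rerunning the last three statements without resetting the seed gives a different sample and a different statistic, while the parameter stays fixed.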
Statistical Inference
There were 252 “Head Coaches” at University of California and California State University campuses in 2019 (that satisfied my search criteria).
Median salary for all 252 coaches
$137,619
Using information from your sample to draw conclusions about the population is called statistical inference.
Statistical Inference Reasoning
Shouldn’t one random sample be enough then? Isn’t that what we use to make confidence intervals and do hypothesis tests?
Virtual Sampling
| Employee Name | Job Title | Total Pay & Benefits |
|---|---|---|
| Beau Baldwin | Asc Head Coach Crd 4 | 708408 |
| Stein Metzger | Intercol Ath Head Coach Ex | 191728 |
| Jordan Wolfrum | Intercol Ath Head Coach Ex | 76597 |
| David Bradley Kreutzkamp | Head Coach 5 | 105683 |
| Daniel Dykes | Head Coach 5 | 540000 |
| Daniel Conners | Head Coach 5 | 156181 |
\(\vdots\)
Distribution of 1000 medians from samples of 25 coaches
Sampling Distributions
Be careful! A sampling distribution is different from a sample’s distribution!
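One way to see the difference is to build both objects side by side. The sketch below uses base R (`sample()` and `replicate()`) as a stand-in for repeated virtual sampling, and again relies on a simulated, hypothetical population rather than the real salary data.

```r
# Hypothetical stand-in population: 252 simulated salaries (NOT the real coaches data).
set.seed(2019)
population <- round(rlnorm(252, meanlog = log(140000), sdlog = 0.6))

# A sample's distribution: the 25 individual salaries in ONE sample.
one_sample <- sample(population, size = 25)

# A sampling distribution: the medians of 1000 SEPARATE samples of 25.
sample_medians <- replicate(1000, median(sample(population, size = 25)))

length(one_sample)      # 25 individual salaries
length(sample_medians)  # 1000 sample medians

# Spread of individual salaries vs. spread of sample medians
summary(one_sample)
summary(sample_medians)
```

Passing `sample_medians` to `hist()` would draw the kind of histogram described on the previous slide. Each value in it is a summary of 25 coaches, not a single coach's salary.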
Distributions of 1000 medians from samples of different sizes
What differences do you see?
Variability for Different Sample Sizes
| Sample Size | Standard Error of Median ($) |
|---|---|
| 25 | 19343.969 |
| 50 | 12459.358 |
| 100 | 8279.311 |
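Standard errors quantify the variability of point estimates.
As a general rule, as sample size increases, the standard error decreases. In the table above, quadrupling the sample size from 25 to 100 cut the standard error of the median by more than half.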
Careful! There are important differences between standard errors and standard deviations.
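A rough sketch of how a table like the one above could be produced: for each sample size, draw many samples, take the median of each, and report the standard deviation of those medians. The population is simulated and hypothetical, and `se_of_median()` is a helper written for this illustration, so the numbers it prints are not the values in the table.

```r
# Hypothetical stand-in population: 252 simulated salaries (NOT the real coaches data).
set.seed(2019)
population <- round(rlnorm(252, meanlog = log(140000), sdlog = 0.6))

# Standard error of the median for samples of size n:
# the standard deviation of many sample medians.
se_of_median <- function(n, reps = 1000) {
  medians <- replicate(reps, median(sample(population, size = n)))
  sd(medians)
}

sample_sizes <- c(25, 50, 100)
standard_errors <- sapply(sample_sizes, se_of_median)

data.frame(sample_size = sample_sizes, se_median = standard_errors)

# For contrast: the standard deviation of individual salaries,
# which does not depend on the sample size.
sd(population)
```

The standard error describes how much a statistic (here, the median) varies from sample to sample, while the standard deviation describes how much individual observations vary.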
A good guess?
Precision & Accuracy
Sampling Activity!